Comparative linguistics

Comparative linguistics (originally comparative philology) is a branch of historical linguistics that is concerned with comparing languages to establish their historical relatedness.

Genetic relatedness implies a common origin or proto-language, and comparative linguistics aims to construct language families, to reconstruct proto-languages and to specify the changes that have resulted in the documented languages. To maintain a clear distinction between attested and reconstructed forms, comparative linguists prefix an asterisk to any form that is not found in surviving texts. A number of methods for carrying out language classification have been developed over a long period, ranging from simple inspection to computerised hypothesis testing.

Methods

The fundamental technique of comparative linguistics is to compare phonological systems, morphological systems, syntax and the lexicon of two or more languages using techniques such as the comparative method. In principle, every difference between two related languages should be explicable to a high degree of plausibility, and systematic changes, for example in phonological or morphological systems, are expected to be highly regular (i.e. consistent). In practice, the comparison may be more restricted, e.g. just to the lexicon. In some methods it may be possible to reconstruct an earlier proto-language. Although the proto-languages reconstructed by the comparative method are hypothetical, a reconstruction may have predictive power. The most notable example of this is Saussure's proposal that the Indo-European consonant system contained laryngeals, a type of consonant attested in no Indo-European language known at the time. The hypothesis was vindicated with the discovery of Hittite, which proved to have exactly the consonants Saussure had hypothesized in the environments he had predicted.
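
The expectation of regularity can be illustrated with a minimal sketch that tallies word-initial correspondences across a handful of cognate pairs. The Latin and English word pairs, the rough orthographic segmentation and the function names `initial_correspondences` and `regularity` are assumptions chosen for illustration; none of them comes from the text above, and a real analysis would work from phonological transcriptions and far larger data sets.

```python
from collections import Counter

# Illustrative (Latin, English) cognate pairs, pre-segmented by hand so that
# the first element of each word is its initial consonant. The data and the
# orthographic segmentation are assumptions made for this sketch only.
COGNATES = [
    (["p", "a", "t", "e", "r"], ["f", "a", "th", "e", "r"]),  # pater / father
    (["p", "e", "s"], ["f", "oo", "t"]),                      # pes / foot
    (["p", "i", "s", "c", "i", "s"], ["f", "i", "sh"]),       # piscis / fish
    (["t", "r", "e", "s"], ["th", "r", "ee"]),                # tres / three
    (["t", "u"], ["th", "ou"]),                               # tu / thou
    (["t", "e", "n", "u", "i", "s"], ["th", "i", "n"]),       # tenuis / thin
]


def initial_correspondences(pairs):
    """Count how often each pair of word-initial segments co-occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)


def regularity(counts):
    """Map each source segment to its most frequent counterpart and the
    share of pairs in which that counterpart appears."""
    totals = Counter()
    best = {}
    for (src, tgt), n in counts.items():
        totals[src] += n
        if n > best.get(src, (None, 0))[1]:
            best[src] = (tgt, n)
    return {src: (tgt, n / totals[src]) for src, (tgt, n) in best.items()}


counts = initial_correspondences(COGNATES)
summary = regularity(counts)
for src, (tgt, share) in sorted(summary.items()):
    print(f"{src} : {tgt}  ({share:.0%} of pairs)")
```

In this toy data set every Latin initial p lines up with English f and every initial t with th. Recurrent pairings of this kind, rather than surface resemblance, are what the comparative method treats as evidence.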

Where languages are derived from a very distant ancestor, and are thus more distantly related, the comparative method becomes impracticable. In particular, attempts to relate two reconstructed proto-languages by the comparative method have not generally produced results that have met with wide acceptance. The method has also not been very good at unambiguously identifying sub-families, and different scholars have produced conflicting results, for example in Indo-European. A number of methods based on statistical analysis of vocabulary, such as lexicostatistics and mass comparison, have been developed to try to overcome this limitation. Lexicostatistics, like the comparative method, uses lexical cognates, whereas mass comparison uses only lexical similarity. The theoretical basis of such methods is that vocabulary items can be matched without a detailed language reconstruction and that comparing enough vocabulary items will cancel out individual inaccuracies. Thus they can be used to determine relatedness but not to reconstruct the proto-language.

History

The earliest method was the comparative method, which was developed over many years and culminated in the nineteenth century. It uses a long word list and detailed study. However, it has been criticized, for example, as subjective, informal and untestable.[1] The comparative method uses information from two or more languages and allows reconstruction of the ancestral language. The method of internal reconstruction uses only a single language, comparing word variants, to perform the same function. Internal reconstruction is more resistant to interference but usually has a limited base of usable words and can reconstruct only certain changes (those that have left traces as morphophonological variations).

In the twentieth century an alternative method, lexicostatistics, was developed; it is mainly associated with Morris Swadesh but is based on earlier work. It compares a short word list of basic vocabulary across the languages in question. Swadesh used 100 (earlier 200) items that are assumed to be cognate (on the basis of phonetic similarity) in the languages being compared, though other lists have also been used. Distance measures are derived by examination of language pairs, but this reduces the information available. An outgrowth of lexicostatistics is glottochronology, initially developed in the 1950s, which proposed a mathematical formula for establishing the date when two languages separated, based on the percentage of a core vocabulary of culturally independent words that they share. In its simplest form a constant rate of change is assumed; later versions allow the rate to vary but still fail to achieve reliability. Glottochronology has met with mounting scepticism and is seldom applied today. Dating estimates can now be generated by computerised methods that have fewer restrictions and calculate rates from the data. However, no mathematical means of producing proto-language split-times on the basis of lexical retention has been proven reliable.
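
For illustration only, the simplest constant-rate form of the glottochronological calculation can be sketched as follows. The formula t = ln(c) / (2 ln(r)), where c is the proportion of shared cognates and r the assumed retention rate per millennium, is the form usually quoted for this approach and is not given in the text above. The function names and the rate of 0.86 in the usage lines are likewise assumptions for the sketch, and the result inherits all the reliability problems just described.

```python
import math


def shared_cognate_proportion(judgments):
    """Proportion of word-list items judged cognate.

    `judgments` holds one boolean per item on the list
    (True means the two languages' words were judged cognate).
    """
    if not judgments:
        raise ValueError("word list is empty")
    return sum(judgments) / len(judgments)


def separation_time(shared, retention_rate):
    """Constant-rate estimate, in millennia, of the time since two
    languages split: t = ln(c) / (2 * ln(r))."""
    if not 0 < shared <= 1:
        raise ValueError("shared proportion must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention rate must be in (0, 1)")
    return math.log(shared) / (2 * math.log(retention_rate))


# Hypothetical input: 74 of 100 list items judged cognate.
judgments = [True] * 74 + [False] * 26
c = shared_cognate_proportion(judgments)

# 0.86 per millennium is an assumed, illustrative retention rate.
t = separation_time(c, retention_rate=0.86)
print(f"shared = {c:.2f}, estimated separation = {t:.2f} millennia")
```

The factor of 2 reflects the assumption that both languages have been losing vocabulary independently since the split. The estimate is only as good as the cognate judgments and the assumed rate.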

Another controversial method, developed by Joseph Greenberg, is mass comparison.[2] The method, which disavows any ability to date developments, aims simply to show which languages are more or less closely related to each other. On the one hand, since mass comparison eschews the establishment of regular changes, it is flatly rejected by the majority of historical linguists. On the other hand, the method has been shown to be useful in the preliminary grouping of languages known to be related, when such findings are backed up by in-depth comparative analysis.

Recently, computerised statistical hypothesis-testing methods have been developed that are related to both the comparative method and lexicostatistics. Character-based methods are similar to the former and distance-based methods are similar to the latter (see Quantitative comparative linguistics). The characters used can be morphological or grammatical as well as lexical. Since the mid-1990s these more sophisticated tree- and network-based cladistic methods have been used to investigate the relationships between languages and to determine approximate dates for proto-languages. These are considered by many to show promise but are not wholly accepted by traditionalists.[3] However, they are not intended to replace older methods but to supplement them. Such statistical methods cannot be used to derive the features of a proto-language, apart from the fact of the existence of shared items of the compared vocabulary. These approaches have been challenged for their methodological problems, since without a reconstruction or at least a detailed list of phonological correspondences there can be no demonstration that two words in different languages are cognate.
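
The relationship between character data and distance data can be sketched with a toy example. The three languages "A", "B" and "C", their binary character codings and the helper names are all hypothetical. The sketch shows only the reduction of a character matrix to pairwise distances, not any published cladistic procedure.

```python
from itertools import combinations

# Hypothetical character matrix: each position is one character (a cognate
# class or a grammatical feature), coded 1 if the language has it, else 0.
CHARACTERS = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 1, 1, 0, 1, 0],
}


def hamming_distance(x, y):
    """Proportion of characters on which two languages differ."""
    if len(x) != len(y):
        raise ValueError("character vectors must have equal length")
    return sum(a != b for a, b in zip(x, y)) / len(x)


def distance_matrix(chars):
    """Pairwise distances for every unordered pair of languages."""
    return {
        (p, q): hamming_distance(chars[p], chars[q])
        for p, q in combinations(sorted(chars), 2)
    }


distances = distance_matrix(CHARACTERS)
closest = min(distances, key=distances.get)
for pair, d in distances.items():
    print(pair, round(d, 3))
print("closest pair:", closest)
```

Each pair of languages is collapsed into a single number, so the distance matrix no longer records which characters the languages share. A character-based analysis works from the full matrix instead.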

Furthermore, the comparative method is unhelpful in the case of "multiple causation",[4] when a lexical item derives from several sources simultaneously as in phono-semantic matching.[5]

Other related fields

There are other branches of linguistics that involve comparing languages, which are not, however, part of comparative linguistics:

There is also a wide body of publications containing language comparisons that are considered pseudoscientific by linguists; see pseudoscientific language comparison.

References

  1. See, for example, McMahon, April; McMahon, Robert. Language Classification by Numbers.
  2. Campbell, Lyle (2004). Historical Linguistics: An Introduction (2nd ed.). Cambridge: The MIT Press.
  3. See, for example, the criticisms of Gray and Atkinson's work in Language Log, 10 December 2003.
  4. Zuckermann, Ghil'ad (2009). "Hybridity versus Revivability: Multiple Causation, Forms and Patterns". Journal of Language Contact, Varia 2: 40-67.
  5. Zuckermann, Ghil'ad (2003). Language Contact and Lexical Enrichment in Israeli Hebrew. Houndmills: Palgrave Macmillan (Palgrave Studies in Language History and Language Change, series editor: Charles Jones). ISBN 1-4039-1723-X.

Bibliography